A trust region-type normal map-based semismooth Newton method for nonsmooth nonconvex composite optimization
We propose a novel trust region method for solving a class of nonsmooth and
nonconvex composite-type optimization problems. The approach embeds inexact
semismooth Newton steps for finding zeros of a normal map-based stationarity
measure for the problem in a trust region framework. Based on a new merit
function and acceptance mechanism, global convergence and transition to fast
local q-superlinear convergence are established under standard conditions. In
addition, we verify that the proposed trust region globalization is compatible
with the Kurdyka-{\L}ojasiewicz (KL) inequality, yielding finer convergence
results. We further derive new normal map-based representations of the
associated second-order optimality conditions that have direct connections to
the local assumptions required for fast convergence. Finally, we study the
behavior of our algorithm when the Hessian matrix of the smooth part of the
objective function is approximated by BFGS updates. We successfully link the KL
theory, properties of the BFGS approximations, and a Dennis-Mor{\'e}-type
condition to show superlinear convergence of the quasi-Newton version of our
method. Numerical experiments on sparse logistic regression and image
compression illustrate the efficiency of the proposed algorithm.
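The normal map-based stationarity measure at the heart of the method can be illustrated concretely. Below is a minimal sketch (not the paper's implementation) for an $\ell_1$-regularized problem with proximal step size 1; the function names and the toy quadratic smooth part are assumptions chosen for illustration only.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximity operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def normal_map_residual(z, grad_f, lam):
    """Normal map F_nor(z) = grad_f(prox(z)) + z - prox(z) for the problem
    min_x f(x) + lam * ||x||_1 with proximal step size 1.
    Zeros of F_nor correspond to stationary points x = prox(z)."""
    x = soft_threshold(z, lam)
    return grad_f(x) + z - x

# Toy smooth part f(x) = 0.5 * ||x - b||^2, so grad_f(x) = x - b.
b = np.array([2.0, -0.5, 0.1])
grad_f = lambda x: x - b

# For this particular f, z = b is an exact zero of the normal map.
r = normal_map_residual(b, grad_f, lam=1.0)
print(np.linalg.norm(r))  # prints 0.0
```

A semismooth Newton method applies generalized-Jacobian-based Newton steps to this (nonsmooth but piecewise smooth) residual; the trust-region machinery of the paper decides when such steps are accepted.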
Variational Properties of Decomposable Functions Part II: Strong Second-Order Theory
Local superlinear convergence of the semismooth Newton method usually
requires the uniform invertibility of the generalized Jacobian matrix, e.g.
BD-regularity or CD-regularity. For several types of nonlinear programming and
composite-type optimization problems -- for which the generalized Jacobian of
the stationary equation can be calculated explicitly -- this is characterized
by the strong second-order sufficient condition. However, general
characterizations are still not well understood. In this paper, we propose a
strong second-order sufficient condition (SSOSC) for composite problems whose
nonsmooth part has a generalized conic-quadratic second subderivative. We then
discuss the relationship between the SSOSC and another second-order-type
condition that involves the generalized Jacobians of the normal map. In
particular, these two conditions are equivalent under certain structural
assumptions on the generalized Jacobian matrix of the proximity operator. Next,
we verify these structural assumptions for strictly decomposable
functions by analyzing their second-order variational properties under
additional geometric assumptions on the support set of the decomposition pair.
Finally, we show that the SSOSC is further equivalent to the strong metric
regularity condition of the subdifferential, the normal map, and the natural
residual. Counterexamples illustrate the necessity of our assumptions.
A Semismooth Newton Stochastic Proximal Point Algorithm with Variance Reduction
We develop an implementable stochastic proximal point (SPP) method for a
class of weakly convex composite optimization problems. The proposed algorithm
incorporates a variance reduction mechanism, and the resulting SPP updates are
solved using an inexact semismooth Newton
framework. We establish detailed convergence results that take the inexactness
of the SPP steps into account and that are in accordance with existing
convergence guarantees of (proximal) stochastic variance-reduced gradient
methods. Numerical experiments show that the proposed algorithm competes
favorably with other state-of-the-art methods and achieves higher robustness
with respect to the step size selection.
Convergence of Random Reshuffling Under The Kurdyka-{\L}ojasiewicz Inequality
We study the random reshuffling (RR) method for smooth nonconvex optimization
problems with a finite-sum structure. Although this method is widely used in
practice, e.g., in the training of neural networks, its convergence behavior is
only understood in several limited settings. In this paper, under the
well-known Kurdyka-{\L}ojasiewicz (KL) inequality, we establish strong limit-point
convergence results for RR with appropriate diminishing step sizes, namely, the
whole sequence of iterates generated by RR is convergent and converges to a
single stationary point in an almost sure sense. In addition, we derive the
corresponding rate of convergence, depending on the KL exponent and the
suitably selected diminishing step sizes. When the KL exponent lies in
$[0, \frac{1}{2}]$, the convergence is at a rate of $\mathcal{O}(t^{-1})$ with
$t$ counting the iteration number. When the KL exponent belongs to
$(\frac{1}{2}, 1)$, our derived convergence rate is of the form
$\mathcal{O}(t^{-q})$ with $q \in (0,1)$ depending on the KL exponent. The
standard KL inequality-based
convergence analysis framework only applies to algorithms with a certain
descent property. We conduct a novel convergence analysis for the non-descent
RR method with diminishing step sizes based on the KL inequality, which
generalizes the standard KL framework. We summarize our main steps and core
ideas in an informal analysis framework, which is of independent interest. As a
direct application of this framework, we also establish similar strong
limit-point convergence results for the reshuffled proximal point method.
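The RR scheme analyzed above admits a very short sketch. The following toy implementation (illustrative only, not the paper's code) assumes simple quadratic components and a diminishing $\mathcal{O}(1/t)$ step size schedule of the kind the analysis requires.

```python
import numpy as np

def random_reshuffling(grads, x0, epochs, step, rng):
    """Random reshuffling (RR) for min (1/n) sum_i f_i(x): each epoch
    draws a fresh permutation of the components and applies one component
    gradient step per index, with a step size diminishing per epoch."""
    x = x0.copy()
    n = len(grads)
    for t in range(1, epochs + 1):
        alpha = step / t                  # diminishing step size, O(1/t)
        for i in rng.permutation(n):      # sampling without replacement
            x = x - alpha * grads[i](x)
    return x

# Toy finite sum: f_i(x) = 0.5 * (x - c_i)^2, minimizer = mean of the c_i.
rng = np.random.default_rng(1)
c = rng.standard_normal(50)
grads = [lambda x, ci=ci: x - ci for ci in c]
x = random_reshuffling(grads, np.array(0.0), epochs=200, step=0.5, rng=rng)
print(abs(x - c.mean()))                  # distance to the minimizer
```

Note that the per-epoch iterates need not decrease the objective monotonically, which is precisely why the non-descent analysis framework described above is needed.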
Nonmonotone globalization for Anderson acceleration via adaptive regularization
Anderson acceleration (AA) is a popular method for accelerating fixed-point iterations, but it may suffer from instability and stagnation. We propose a globalization method for AA to improve stability and achieve unified global and local convergence. Unlike existing AA globalization approaches that rely on safeguarding operations and might hinder fast local convergence, we adopt a nonmonotone trust-region framework and introduce an adaptive quadratic regularization together with a tailored acceptance mechanism. We prove global convergence and show that our algorithm attains the same local convergence rate as AA under appropriate assumptions. The effectiveness of our method is demonstrated in several numerical experiments.
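For context, the vanilla AA(m) iteration that the proposed method globalizes can be sketched as follows. This plain difference-form implementation is a common textbook variant, not the paper's algorithm: it omits the nonmonotone trust-region, adaptive regularization, and acceptance mechanism entirely.

```python
import numpy as np

def anderson(g, x0, m=5, iters=100, tol=1e-10):
    """Basic Anderson acceleration AA(m) for the fixed-point problem
    x = g(x): a small least squares problem over the last m residual
    differences determines the extrapolation coefficients gamma."""
    x = x0.copy()
    f = g(x) - x                          # fixed-point residual
    X, F = [], []                         # histories of dx and df
    for _ in range(iters):
        if np.linalg.norm(f) <= tol:
            break
        if F:
            dF = np.column_stack(F[-m:])
            dX = np.column_stack(X[-m:])
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x_new = x + f - (dX + dF) @ gamma   # accelerated step
        else:
            x_new = x + f                 # plain fixed-point step to start
        f_new = g(x_new) - x_new
        X.append(x_new - x)
        F.append(f_new - f)
        x, f = x_new, f_new
    return x

# Toy affine contraction: g(x) = A x + b with spectral norm of A below 1.
rng = np.random.default_rng(2)
M = rng.standard_normal((8, 8))
A = 0.9 * M / np.linalg.norm(M, 2)
b = rng.standard_normal(8)
g = lambda x: A @ x + b
x = anderson(g, np.zeros(8))
print(np.linalg.norm(g(x) - x))           # final fixed-point residual
```

On a well-conditioned contraction like this, plain AA converges rapidly; the instability and stagnation that motivate the paper's safeguards arise when the least squares problem becomes ill-conditioned or the underlying map is not contractive.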